Smart Paper Notes

A photosensor employing data-driven binning for ultrafast image recognition

Lukas Mennel , Aday J. Molina-Mendoza , Matthias Paur , Dmitry K. Polyushkin , Dohyun Kwak , Miriam Giparakis , Maximilian Beiser , Aaron Maxwell Andrews , Thomas Mueller

Category: Computer Vision

2021-11-20

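Pixel binning is a technique widely used in optical image acquisition and spectroscopy, in which adjacent detector elements of an image sensor are combined into larger pixels. This reduces the amount of data to be processed as well as the impact of noise, but at the cost of losing information. Here, we push the concept of binning to its limit by combining a large fraction of the sensor elements into a single superpixel that extends over the whole face of the chip. For a given pattern recognition task, its optimal shape is determined from training data using a machine learning algorithm. We demonstrate the classification of optically projected images from the MNIST dataset on a nanosecond timescale, with enhanced sensitivity and without loss of classification accuracy. Our concept is not limited to imaging alone but can also be applied in optical spectroscopy or other sensing applications.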

translated by Google Translate
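
For the photosensor abstract above, the idea of a data-driven superpixel can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses one binary mask per class (the abstract speaks of a single superpixel), and `learn_masks` is a hypothetical toy rule standing in for the machine learning algorithm, which the abstract does not specify. The 3x3 bar images are made-up toy data, not MNIST.

```python
def learn_masks(images, labels, num_classes):
    """Toy stand-in for mask training (an assumption, not the paper's method):
    a pixel joins class c's superpixel when its mean intensity over class-c
    images exceeds its mean intensity over all images."""
    num_pixels = len(images[0])
    overall = [sum(img[p] for img in images) / len(images) for p in range(num_pixels)]
    masks = []
    for c in range(num_classes):
        members = [img for img, y in zip(images, labels) if y == c]
        class_mean = [sum(img[p] for img in members) / len(members) for p in range(num_pixels)]
        masks.append([1 if class_mean[p] > overall[p] else 0 for p in range(num_pixels)])
    return masks


def readout(image, mask):
    """Binning: sum the signal of every pixel that belongs to the superpixel."""
    return sum(value for value, selected in zip(image, mask) if selected)


def classify(image, masks):
    """Predict the class whose superpixel collects the largest summed signal."""
    signals = [readout(image, mask) for mask in masks]
    return signals.index(max(signals))


# 3x3 toy images, flattened row by row: class 0 is a vertical bar, class 1 a horizontal bar.
vertical = [0, 1, 0, 0, 1, 0, 0, 1, 0]
horizontal = [0, 0, 0, 1, 1, 1, 0, 0, 0]
masks = learn_masks([vertical, horizontal], [0, 1], num_classes=2)

noisy_vertical = [0.1, 0.9, 0.0, 0.0, 1.0, 0.1, 0.0, 0.8, 0.0]
prediction = classify(noisy_vertical, masks)
```

Each class is read out as a single summed value, so the amount of data per image drops from nine pixel values to two signals in this toy case.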

Active Learning for Computationally Efficient Distribution of Binary Evolution Simulations

Kyle Akira Rocha , Jeff J. Andrews , Christopher P. L. Berry , Zoheyr Doctor , Pablo Marchant , Vicky Kalogera , Scott Coughlin , Simone S. Bavera , Aaron Dotter , Tassos Fragos

Category: Machine Learning

2022-03-30

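Binary stars undergo a variety of interactions and evolutionary phases that are critical for predicting and explaining observed properties. Binary population synthesis with full stellar-structure and evolution simulations is computationally expensive, requiring a large number of mass-transfer sequences. The recently developed binary population synthesis code POSYDON incorporates grids of MESA binary star simulations, which are then interpolated to model large populations of massive binaries. The traditional method of computing a high-density rectilinear grid of simulations does not scale to higher-dimensional grids that account for a range of metallicities, rotation, and eccentricity. We present a new active learning algorithm, psy-cris, which uses machine learning in the data-gathering process to adaptively and iteratively select targeted simulations to run, resulting in a custom, high-performance training set. We test psy-cris on a toy problem and find that the resulting training sets require fewer simulations for accurate classification and regression than either regular or randomly sampled grids. We further apply psy-cris to the target problem of building a dynamic grid of MESA simulations, and we demonstrate that, even without fine-tuning, only $\sim 1/4$ of the simulations are sufficient to achieve the same classification accuracy. We anticipate further gains when the algorithmic parameters are optimized for the targeted application. We find that optimizing for classification alone may lead to performance losses in regression, and vice versa. Lowering the computational cost of producing grids will enable future versions of POSYDON to cover more input parameters while preserving interpolation accuracy.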
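
The contrast between a regular grid and adaptively chosen simulations can be illustrated with a one-dimensional toy problem. The sketch below is not psy-cris: `simulate` is a hypothetical stand-in for an expensive simulation, the class boundary at 0.37 is made up, and the selection rule (query the midpoint between neighbouring samples with different labels) is a simple uncertainty-sampling assumption that covers classification only, not regression.

```python
def simulate(x):
    """Hypothetical stand-in for an expensive simulation that returns a class label."""
    return 1 if x > 0.37 else 0


def active_sample(budget, lo=0.0, hi=1.0):
    """Start from the two end points, then repeatedly run the simulation at the
    midpoint of the adjacent pair of samples whose labels disagree."""
    samples = [(lo, simulate(lo)), (hi, simulate(hi))]
    while len(samples) < budget:
        samples.sort()
        for (x0, y0), (x1, y1) in zip(samples, samples[1:]):
            if y0 != y1:
                mid = 0.5 * (x0 + x1)
                samples.append((mid, simulate(mid)))
                break
        else:
            break  # no neighbouring labels disagree, so nothing informative is left to query
    return sorted(samples)


def grid_sample(budget, lo=0.0, hi=1.0):
    """Regular grid with the same number of simulations, for comparison."""
    step = (hi - lo) / (budget - 1)
    return [(lo + i * step, simulate(lo + i * step)) for i in range(budget)]


def boundary_width(samples):
    """Width of the interval that is still known to contain the class boundary."""
    for (x0, y0), (x1, y1) in zip(samples, samples[1:]):
        if y0 != y1:
            return x1 - x0
    return 0.0


active = active_sample(10)
grid = grid_sample(10)
print(boundary_width(active), boundary_width(grid))
```

With the same budget of ten simulations, the adaptive samples bracket the boundary far more tightly than the grid does, which is the kind of saving the abstract reports for its much harder, higher-dimensional setting.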

ReSQueing Parallel and Private Stochastic Convex Optimization

Yair Carmon , Arun Jambulapati , Yujia Jin , Yin Tat Lee , Daogao Liu , Aaron Sidford , Kevin Tian

Category: Machine Learning | Machine Learning (Statistics)

2023-01-01

We introduce a new tool for stochastic convex optimization (SCO): a Reweighted Stochastic Query (ReSQue) estimator for the gradient of a function convolved with a (Gaussian) probability density. Combining ReSQue with recent advances in ball oracle acceleration [CJJJLST20, ACJJS21], we develop algorithms achieving state-of-the-art complexities for SCO in parallel and private settings. For a SCO objective constrained to the unit ball in $\mathbb{R}^d$, we obtain the following results (up to polylogarithmic factors). We give a parallel algorithm obtaining optimization error $\epsilon_{\text{opt}}$ with $d^{1/3}\epsilon_{\text{opt}}^{-2/3}$ gradient oracle query depth and $d^{1/3}\epsilon_{\text{opt}}^{-2/3} + \epsilon_{\text{opt}}^{-2}$ gradient queries in total, assuming access to a bounded-variance stochastic gradient estimator. For $\epsilon_{\text{opt}} \in [d^{-1}, d^{-1/4}]$, our algorithm matches the state-of-the-art oracle depth of [BJLLS19] while maintaining the optimal total work of stochastic gradient descent. We give an $(\epsilon_{\text{dp}}, \delta)$-differentially private algorithm which, given $n$ samples of Lipschitz loss functions, obtains near-optimal optimization error and makes $\min(n, n^2\epsilon_{\text{dp}}^2 d^{-1}) + \min(n^{4/3}\epsilon_{\text{dp}}^{1/3}, (nd)^{2/3}\epsilon_{\text{dp}}^{-1})$ queries to the gradients of these functions. In the regime $d \le n \epsilon_{\text{dp}}^{2}$, where privacy comes at no cost in terms of the optimal loss up to constants, our algorithm uses $n + (nd)^{2/3}\epsilon_{\text{dp}}^{-1}$ queries and improves recent advancements of [KLL21, AFKT21]. In the moderately low-dimensional setting $d \le \sqrt n \epsilon_{\text{dp}}^{3/2}$, our query complexity is near-linear.
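
A rough one-dimensional sketch of the reweighting idea behind such an estimator follows. It assumes the reweighting is an importance-sampling density ratio: queries are drawn around a fixed reference point `x_bar` and reweighted to estimate the gradient of the Gaussian-smoothed function at a nearby point `x`. The names `resque_gradient`, `gaussian_density`, `x_bar`, and `rho` are illustrative and not taken from the paper, and the ball oracle acceleration, parallel, and private components are not shown.

```python
import math
import random


def gaussian_density(z, rho):
    """Density of a zero-mean Gaussian with standard deviation rho, evaluated at z."""
    return math.exp(-z * z / (2.0 * rho * rho)) / (rho * math.sqrt(2.0 * math.pi))


def resque_gradient(grad_f, x, x_bar, rho, num_samples, rng):
    """Average of reweighted queries: draw xi ~ N(x_bar, rho^2) and weight
    grad_f(xi) by the density ratio gamma(x - xi) / gamma(x_bar - xi).
    By importance sampling the expectation is the gradient at x of f
    convolved with the Gaussian density."""
    total = 0.0
    for _ in range(num_samples):
        xi = rng.gauss(x_bar, rho)
        weight = gaussian_density(x - xi, rho) / gaussian_density(x_bar - xi, rho)
        total += weight * grad_f(xi)
    return total / num_samples


rng = random.Random(0)
# f(z) = z^2 / 2 has gradient z; its Gaussian smoothing is x^2 / 2 + rho^2 / 2,
# so the smoothed gradient at x is x itself.
estimate = resque_gradient(lambda z: z, x=0.3, x_bar=0.0, rho=1.0,
                           num_samples=200_000, rng=rng)
print(estimate)  # should be close to 0.3, up to sampling noise
```

Because every sample is drawn around `x_bar` rather than around `x`, the same batch of queries can in principle be reused for several nearby points `x`; the weights stay well behaved only while `x` remains close to `x_bar` relative to `rho`.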

Hybrid Quantum-Classical Generative Adversarial Network for High Resolution Image Generation

Shu Lok Tsang , Maxwell T. West , Sarah M. Erfani , Muhammad Usman

Category: Computer Vision | Machine Learning

2022-12-22

Quantum machine learning (QML) has received increasing attention due to its potential to outperform classical machine learning methods in various problems. A subclass of QML methods is quantum generative adversarial networks (QGANs), which have been studied as a quantum counterpart of classical GANs widely used in image manipulation and generation tasks. The existing work on QGANs is still limited to small-scale proof-of-concept examples based on images with significant down-scaling. Here we integrate classical and quantum techniques to propose a new hybrid quantum-classical GAN framework. We demonstrate its superior learning capabilities by generating $28 \times 28$ pixel grey-scale images without dimensionality reduction or classical pre/post-processing on multiple classes of the standard MNIST and Fashion MNIST datasets, which achieves comparable results to classical frameworks with 3 orders of magnitude fewer trainable generator parameters. To gain further insight into the working of our hybrid approach, we systematically explore the impact of its parameter space by varying the number of qubits, the size of image patches, the number of layers in the generator, the shape of the patches and the choice of prior distribution. Our results show that increasing the quantum generator size generally improves the learning capability of the network. The developed framework provides a foundation for future design of QGANs with optimal parameter set tailored for complex image generation tasks.

RangeAugment: Efficient Online Augmentation with Range Learning

Sachin Mehta , Saeid Naderiparizi , Fartash Faghri , Maxwell Horton , Lailin Chen , Ali Farhadi , Oncel Tuzel , Mohammad Rastegari

Category: Computer Vision | Artificial Intelligence | Machine Learning

2022-12-20

State-of-the-art automatic augmentation methods (e.g., AutoAugment and RandAugment) for visual recognition tasks diversify training data using a large set of augmentation operations. The range of magnitudes of many augmentation operations (e.g., brightness and contrast) is continuous. Therefore, to make search computationally tractable, these methods use fixed and manually-defined magnitude ranges for each operation, which may lead to sub-optimal policies. To answer the open question on the importance of magnitude ranges for each augmentation operation, we introduce RangeAugment that allows us to efficiently learn the range of magnitudes for individual as well as composite augmentation operations. RangeAugment uses an auxiliary loss based on image similarity as a measure to control the range of magnitudes of augmentation operations. As a result, RangeAugment has a single scalar parameter for search, image similarity, which we simply optimize via linear search. RangeAugment integrates seamlessly with any model and learns model- and task-specific augmentation policies. With extensive experiments on the ImageNet dataset across different networks, we show that RangeAugment achieves competitive performance to state-of-the-art automatic augmentation methods with 4-5 times fewer augmentation operations. Experimental results on semantic segmentation, object detection, foundation models, and knowledge distillation further show RangeAugment's effectiveness.
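
To make the role of the similarity-based auxiliary loss more concrete, here is a small sketch for a single brightness operation. Everything specific in it is an assumption for illustration: PSNR as the image similarity measure, the absolute difference to a target value as the loss, the toy four-pixel image, and the comparison of a few fixed candidate ranges in place of learning the range during training. The function names are hypothetical and do not come from the RangeAugment code.

```python
import math
import random


def adjust_brightness(image, magnitude):
    """Scale pixel intensities in [0, 1] by `magnitude` and clip back to [0, 1]."""
    return [min(1.0, max(0.0, p * magnitude)) for p in image]


def psnr(original, augmented):
    """Peak signal-to-noise ratio in dB for intensities in [0, 1] (capped at 100 dB)."""
    mse = sum((a - b) ** 2 for a, b in zip(original, augmented)) / len(original)
    return 10.0 * math.log10(1.0 / max(mse, 1e-10))


def range_loss(image, low, high, target_psnr, rng, draws=2000):
    """Mean |PSNR - target| when magnitudes are drawn uniformly from [low, high]."""
    total = 0.0
    for _ in range(draws):
        magnitude = rng.uniform(low, high)
        total += abs(psnr(image, adjust_brightness(image, magnitude)) - target_psnr)
    return total / draws


rng = random.Random(0)
image = [0.2, 0.4, 0.6, 0.8]
target = 20.0  # the single scalar that would be tuned by linear search
candidates = [(0.9, 1.1), (0.7, 1.3), (0.3, 1.7)]
losses = {r: range_loss(image, r[0], r[1], target, rng) for r in candidates}
best_range = min(losses, key=losses.get)
print(losses, best_range)
```

A very narrow range keeps the augmented image almost identical to the original, so its PSNR sits far above the target and the loss is large; widening the range pulls the similarity toward the target. Changing `target` therefore changes which range is preferred, which is the sense in which one scalar controls the policy.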

Language Modeling with Latent Situations

Belinda Z. Li , Maxwell Nye , Jacob Andreas

Category: Natural Language Processing

2022-12-20

Language models (LMs) often generate incoherent outputs: they refer to events and entity states that are incompatible with the state of the world described in their inputs. We introduce SituationSupervision, a family of approaches for improving coherence in LMs by training them to construct and condition on explicit representations of entities and their states. SituationSupervision has two components: an auxiliary situation modeling task that trains models to predict state representations in context, and a latent state inference procedure that imputes these states from partially annotated training data. SituationSupervision can be applied to both fine-tuning (by supervising LMs to encode state variables in their hidden representations) and prompting (by inducing LMs to interleave textual descriptions of entity states with output text). In both cases, SituationSupervision requires only a small number of state annotations to produce major coherence improvements (between 4-11%), showing that standard LMs can be sample-efficiently trained to model not just language but the situations it describes.

On Improving Summarization Factual Consistency from Natural Language Feedback

Yixin Liu , Budhaditya Deb , Milagro Teruel , Aaron Halfaker , Dragomir Radev , Ahmed H. Awadallah

Category: Natural Language Processing

2022-12-20

Despite the recent progress in language generation models, their outputs may not always meet user expectations. In this work, we study whether informational feedback in natural language can be leveraged to improve generation quality and user preference alignment. To this end, we consider factual consistency in summarization, the quality that the summary should only contain information supported by the input documents, for user preference alignment. We collect a high-quality dataset, DeFacto, containing human demonstrations and informational feedback in natural language consisting of corrective instructions, edited summaries, and explanations with respect to the factual consistency of the summary. Using our dataset, we study two natural language generation tasks: 1) editing a summary using the human feedback, and 2) generating human feedback from the original summary. Using the two tasks, we further evaluate if models can automatically correct factual inconsistencies in generated summaries. We show that the human-edited summaries we collected are more factually consistent, and pre-trained language models can leverage our dataset to improve the factual consistency of original system-generated summaries in our proposed generation tasks. We make the DeFacto dataset publicly available at https://github.com/microsoft/DeFacto.

Dexterous Manipulation from Images: Autonomous Real-World RL via Substep Guidance

Kelvin Xu , Zheyuan Hu , Ria Doshi , Aaron Rovinsky , Vikash Kumar , Abhishek Gupta , Sergey Levine

Category: Machine Learning | Artificial Intelligence | Computer Vision | Robotics

2022-12-19

Complex and contact-rich robotic manipulation tasks, particularly those that involve multi-fingered hands and underactuated object manipulation, present a significant challenge to any control method. Methods based on reinforcement learning offer an appealing choice for such settings, as they can enable robots to learn to delicately balance contact forces and dexterously reposition objects without strong modeling assumptions. However, running reinforcement learning on real-world dexterous manipulation systems often requires significant manual engineering. This negates the benefits of autonomous data collection and ease of use that reinforcement learning should in principle provide. In this paper, we describe a system for vision-based dexterous manipulation that provides a "programming-free" approach for users to define new tasks and enable robots with complex multi-fingered hands to learn to perform them through interaction. The core principle underlying our system is that, in a vision-based setting, users should be able to provide high-level intermediate supervision that circumvents challenges in teleoperation or kinesthetic teaching, which allows a robot not only to learn a task efficiently but also to practice autonomously. Our system includes a framework for users to define a final task and intermediate sub-tasks with image examples, a reinforcement learning procedure that learns the task autonomously without interventions, and experimental results with a four-finger robotic hand learning multi-stage object manipulation tasks directly in the real world, without simulation, manual modeling, or reward engineering.

Reduced Order Model of a Generic Submarine for Maneuvering Near the Surface

J. Ezequiel Martin , Maxwell Hammond , Nicholas Rober , Yakin Kim , Venanzio Cichella , Pablo Carrica

Category: Robotics

2022-12-19

A reduced order model of a generic submarine is presented. Computational fluid dynamics (CFD) results are used to create and validate a model that includes depth dependence and the effect of waves on the craft. The model and the procedure to obtain its coefficients are discussed, and examples of the data used to obtain the model coefficients are presented. An example of operation following a complex path is presented and results from the reduced order model are compared to those from an equivalent CFD calculation. The controller implemented to complete these maneuvers is also presented.

KNIFE: Knowledge Distillation with Free-Text Rationales

Aaron Chan , Zhiyuan Zeng , Wyatt Lake , Brihi Joshi , Hanjie Chen , Xiang Ren

Category: Natural Language Processing | Artificial Intelligence | Machine Learning

2022-12-19

Free-text rationales (FTRs) follow how humans communicate by explaining reasoning processes via natural language. A number of recent works have studied how to improve language model (LM) generalization by using FTRs to teach LMs the correct reasoning processes behind correct task outputs. These prior works aim to learn from FTRs by appending them to the LM input or target output, but this may introduce an input distribution shift or conflict with the task objective, respectively. We propose KNIFE, which distills FTR knowledge from an FTR-augmented teacher LM (takes both task input and FTR) to a student LM (takes only task input), which is used for inference. Crucially, the teacher LM's forward computation has a bottleneck stage in which all of its FTR states are masked out, which pushes knowledge from the FTR states into the task input/output states. Then, FTR knowledge is distilled to the student LM by training its task input/output states to align with the teacher LM's. On two question answering datasets, we show that KNIFE significantly outperforms existing FTR learning methods, in both fully-supervised and low-resource settings.
